Geometric

We can use the base plot functions in R to plot the pmf of a geometric random variable, \(X\), with parameter \(p\). Here \(X\) counts the number of trials up to and including the first success, and the example uses \(p = 0.1\).

  x <- 0:25
  # R's dgeom() counts the failures before the first success, not the trials,
  # so we evaluate it at x-1 to get the pmf of the number of trials.
  plot(x, dgeom(x-1,0.1), lty=1, col=1, type="h", xlab="x", ylab="p(x)")
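
As a quick check on the shift by one, the sketch below compares `dgeom(x-1, p)` with the pmf written directly in terms of the number of trials, \((1-p)^{x-1}p\) for \(x = 1, 2, \dots\). The helper name `pmf_trials` is introduced here for illustration only and is not part of base R.

```r
# Hypothetical helper (not in base R): pmf of the number of trials
# up to and including the first success.
pmf_trials <- function(x, p) {
  ifelse(x >= 1, (1 - p)^(x - 1) * p, 0)
}

x <- 0:25
p <- 0.1

# dgeom() counts failures before the first success, so x trials
# correspond to x - 1 failures.
shifted <- dgeom(x - 1, p)

# The two versions should agree up to floating-point error.
agree <- isTRUE(all.equal(shifted, pmf_trials(x, p)))
print(agree)
```

If `agree` is `TRUE`, the shifted call to `dgeom()` matches the number-of-trials pmf on the plotted range, including the zero mass at \(x = 0\).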

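The CDF may be plotted analogously. Here we overlay the CDFs for \(p = 0.1\), \(0.5\), and \(0.9\).

  x <- 0:25
  plot(x, pgeom(x-1, 0.1), lty=1, col=1, type="s", xlab="x", ylab="F(x)", ylim=c(0,1))
  lines(x, pgeom(x-1, 0.5), lty=2, col=2, type="s")
  lines(x, pgeom(x-1, 0.9), lty=3, col=3, type="s")
  # Label the three curves so they can be told apart
  legend("bottomright", legend=c("p = 0.1", "p = 0.5", "p = 0.9"), lty=1:3, col=1:3)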

Note that the CDF is defined for all real \(x\). However, because the pmf places mass only at integer values, the CDF is not continuous: it is a step function. It is non-decreasing, flat on each interval \(i \le x < i+1\), and jumps up by \(p(i+1)\) at \(x=i+1\). This can be seen in the plot below.

  x <- 0:25
  plot(x, pgeom(x-1, 0.1), lty=1, col=1, type="s", xlab="x", ylab="F(x)", ylim=c(0,1))
  abline(v=5, lty=1, col="green")
  abline(v=6, lty=2, col="red")

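With the number-of-trials convention used in the plots, \(F(x)\) is computed as pgeom(x-1, 0.1). We see that for \(5 \le x < 6\), \(F(x) = 1 - 0.9^5 =\) 0.40951. At \(x=6\), \(F\) jumps up by \(p(6) = 0.9^5 \times 0.1 =\) 0.059049, and for \(6 \le x < 7\) the CDF is \(F(x) = F(5) + p(6) =\) 0.468559.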
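
The step behaviour described above can also be checked numerically rather than read off the plot. The sketch below assumes the number-of-trials convention used in the plotting code; the wrapper names `cdf_trials` and `pmf_trials` are illustrative only and are not part of base R.

```r
p <- 0.1

# Hypothetical wrappers (not in base R): CDF and pmf of the number of
# trials, obtained by shifting R's failure-count functions by one.
cdf_trials <- function(x, p) pgeom(x - 1, p)
pmf_trials <- function(x, p) dgeom(x - 1, p)

i <- 0:24

# Flat between integers: F(i + 0.5) equals F(i).
flat <- isTRUE(all.equal(cdf_trials(i + 0.5, p), cdf_trials(i, p)))

# Jump at each integer: F(i + 1) - F(i) equals p(i + 1).
jump <- isTRUE(all.equal(cdf_trials(i + 1, p) - cdf_trials(i, p),
                         pmf_trials(i + 1, p)))

print(c(flat = flat, jump = jump))
```

Both checks compare with `all.equal()` rather than `==`, since the differences of CDF values are subject to floating-point rounding.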

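We can use the ggplot2 package to enhance the plot. The plot below represents the CDF step function more faithfully: each step has a solid point at its left end, which is included, and an open point at its right end, which is not.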

  library(ggplot2)

  x <- 0:25
  y <- pgeom(x-1, 0.1)
  df <- data.frame(x, y)
  # Each step ends where the next one begins; the last step has no end (NA)
  df$xend <- c(df$x[2:nrow(df)], NA)
  df$yend <- df$y
  p <- (ggplot(df, aes(x=x, y=y, xend=xend, yend=yend)) +
        ylab("F(x)") +
        geom_vline(aes(xintercept=x), linetype=2, color="grey") +
        geom_point() +  # Solid points to left
        geom_point(aes(x=xend, y=y), shape=1, na.rm=TRUE) +  # Open points to right
        geom_segment(na.rm=TRUE))  # Horizontal line segments; na.rm drops the open-ended last step silently
  p